Confidence-driven Rewriting for Improved Translation

Authors

  • Shachar Mirkin
  • Sriram Venkatapathy
  • Marc Dymetman

Abstract

Some source texts are more difficult to translate than others. One way to handle such texts is to modify them prior to translation. Yet a prominent factor that is often overlooked is the translatability of the source with respect to the specific translation system and the specific model being used. We present an approach, and an interactive tool implementing it, in which source sentences are rewritten to maximize confidence estimates with respect to the translation model. The automatically generated rewritings are then proposed to the user for approval. Such an approach can reduce post-editing effort, replacing it with cost-effective pre-editing that can be done by monolinguals.
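
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `rank_rewritings` and `toy_confidence` are hypothetical, and the toy estimator (shorter sentences score higher) merely stands in for a real confidence estimator tied to a translation model. The sketch keeps only those candidate rewritings whose estimated confidence exceeds that of the original sentence and orders them best first, so that they could be proposed to a user for approval.

```python
from typing import Callable, Iterable, List, Tuple


def rank_rewritings(
    source: str,
    candidates: Iterable[str],
    confidence: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Return (rewriting, score) pairs that beat the source's score, best first."""
    baseline = confidence(source)
    scored = [(cand, confidence(cand)) for cand in candidates]
    # Keep only rewritings estimated to be easier to translate than the source.
    better = [(cand, score) for cand, score in scored if score > baseline]
    return sorted(better, key=lambda pair: pair[1], reverse=True)


def toy_confidence(sentence: str) -> float:
    """Hypothetical stand-in estimator: fewer words means higher confidence."""
    return 1.0 / (1 + len(sentence.split()))
```

In an interactive setting, the returned list would be shown to a monolingual user, who accepts a rewriting only if it preserves the meaning of the original sentence.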

Similar resources

Confidence-based Rewriting of Machine Translation Output

Numerous works in Statistical Machine Translation (SMT) have attempted to identify better translation hypotheses obtained by an initial decoding using an improved, but more costly scoring function. In this work, we introduce an approach that takes the hypotheses produced by a state-of-the-art, reranked phrase-based SMT system, and explores new parts of the search space by applying rewriting rule...

English-Persian Plagiarism Detection based on a Semantic Approach

Plagiarism, which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them”, poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories: direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...

SORT: An Interactive Source-Rewriting Tool for Improved Translation

The quality of automatic translation is affected by many factors. One is the divergence between the specific source and target languages. Another lies in the source text itself, as some texts are more complex than others. One way to handle such texts is to modify them prior to translation. Yet, an important factor that is often overlooked is the source translatability with respect to the specif...

Algebraic Matching of Vulnerabilities in a Low-Level Code

This paper explores the algebraic matching approach for detection of vulnerabilities in binary codes. The algebraic programming system is used for implementing this method. It is anticipated that models of vulnerabilities and programs to be verified are presented as behavior algebra and action language specifications. The methods of algebraic matching are based on rewriting rules and techniques...

Using Automatic Machine Translation Metrics to Analyze the Impact of Source Reformulations

This paper investigates the usefulness of automatic machine translation metrics when analyzing the impact of source reformulations on the quality of machine-translated user-generated content. We propose a novel framework to quickly identify rewriting rules which improve or degrade the quality of MT output, by trying to rely on automatic metrics rather than human judgments. We find that this appr...

Publication date: 2013